Smart Paper Notes

Cryo-shift: Reducing domain shift in cryo-electron subtomograms with unsupervised domain adaptation and randomization

Hmrishav Bandyopadhyay, Zihao Deng, Leiting Ding, Sinuo Liu, Mostofa Rafid Uddin, Xiangrui Zeng, Sima Behpour, Min Xu

Categories: Computer Vision | Machine Learning

2021-11-17

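Cryo-electron tomography (cryo-ET) is a 3D imaging technique that enables visualization of subcellular structures in situ at near-atomic resolution. Cellular cryo-ET images help resolve the structures of macromolecules and determine their spatial relationships within a single cell, which has broad significance for cell and structural biology. Subtomogram classification and recognition constitute a primary step in the systematic recovery of these macromolecular structures. Supervised deep learning methods have been shown to be highly accurate and efficient for subtomogram classification, but their applicability is limited by the scarcity of annotated data. Generating simulated data to train supervised models is a potential solution, but a sizeable difference between the image intensity distribution of generated data and that of real experimental data causes the trained models to perform poorly when predicting classes on real subtomograms. In this work, we present Cryo-Shift, a fully unsupervised domain adaptation and randomization framework for deep learning-based cross-domain subtomogram classification. We use unsupervised multi-adversarial domain adaptation to reduce the domain shift between the features of simulated and experimental data. We develop a network-driven domain randomization procedure with "warp" modules to alter the simulated data and help the classifier generalize better to experimental data. We do not use any labeled experimental data to train our model, whereas some existing alternative approaches require labeled experimental samples for cross-domain classification. Nevertheless, Cryo-Shift outperforms the existing alternative approaches in cross-domain subtomogram classification in the extensive evaluation studies presented here, which use both simulated and experimental data.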

translated by Google Translate

Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study

Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

Categories: Natural Language Processing

2022-12-20

Neural models that do not rely on pre-training have excelled in the keyphrase generation task with large annotated datasets. Meanwhile, new approaches have incorporated pre-trained language models (PLMs) for their data efficiency. However, a systematic study of how the two types of approaches compare, and of how different design choices affect the performance of PLM-based models, is still lacking. To fill this knowledge gap and facilitate a more informed use of PLMs for keyphrase extraction and keyphrase generation, we present an in-depth empirical study. Formulating keyphrase extraction as sequence labeling and keyphrase generation as sequence-to-sequence generation, we perform extensive experiments in three domains. After showing that PLMs have competitive high-resource performance and state-of-the-art low-resource performance, we investigate important design choices including in-domain PLMs, PLMs with different pre-training objectives, using PLMs with a parameter budget, and different formulations for present keyphrases. Further results show that (1) in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models; (2) with a fixed parameter budget, prioritizing model depth over width and allocating more layers in the encoder leads to better encoder-decoder models; and (3) introducing four in-domain PLMs, we achieve a competitive performance in the news domain and the state-of-the-art performance in the scientific domain.


PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

Categories: Natural Language Processing

2022-12-20

Privacy policies provide individuals with information about their rights and how their personal information is handled. Natural language understanding (NLU) technologies can help individuals and practitioners better understand the privacy practices described in lengthy and complex documents. However, existing efforts that use NLU technologies are limited in that they process the language in a way exclusive to a single task focusing on certain privacy practices. To this end, we introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding across various tasks. We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training. We demonstrate that domain-specific pre-training offers performance improvements across all tasks. We release the benchmark to encourage future research in this domain.


CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

Categories: Natural Language Processing

2022-12-20

While pre-trained language models (LM) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development. Such overlooking constrains code language models' capacity in code completion, leading to unexpected behaviors such as generating hallucinated class member functions or function calls with unexpected arguments. In this work, we develop a cross-file context finder tool, CCFINDER, that effectively locates and retrieves the most relevant cross-file context. We propose CoCoMIC, a framework that incorporates cross-file context to learn the in-file and cross-file context jointly on top of pretrained code LMs. CoCoMIC successfully improves the existing code LM with a 19.30% relative increase in exact match and a 15.41% relative increase in identifier matching for code completion when the cross-file context is provided.


A Dependable Hybrid Machine Learning Model for Network Intrusion Detection

Md. Alamin Talukder, Khondokar Fida Hasan, Md. Manowarul Islam, Md Ashraf Uddin, Arnisha Akhter, Mohammand Abu Yousuf, Fares Alharbi, Mohammad Ali Moni

Categories: Machine Learning

2022-12-08

Network intrusion detection systems (NIDSs) play an important role in computer network security. There are several detection mechanisms where anomaly-based automated detection outperforms others significantly. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDS. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while securing dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with an accuracy of 99.99% and 100% for KDDCUP'99 and CIC-MalMem-2022, respectively, and no overfitting or Type-1 and Type-2 issues.


Automated Level Crossing System: A Computer Vision Based Approach with Raspberry Pi Microcontroller

Rafid Umayer Murshed, Sandip Kollol Dhruba, Md. Tawheedul Islam Bhuian, Mst. Rumi Akter

Categories: Computer Vision

2022-12-08

In a rapidly flourishing country like Bangladesh, accidents in unmanned level crossings are increasing daily. This study presents a deep learning-based approach for automating level crossing junctions, ensuring maximum safety. Here, we develop a fully automated technique using computer vision on a microcontroller that will reduce and eliminate level-crossing deaths and accidents. A Raspberry Pi microcontroller detects impending trains using computer vision on live video, and the intersection is closed until the incoming train passes unimpeded. Live video activity recognition and object detection algorithms scan the junction 24/7. Self-regulating microcontrollers control the entire process. When persistent unauthorized activity is identified, authorities, such as police and fire brigade, are notified via automated messages and notifications. The microcontroller evaluates live rail-track data, and arrival and departure times to anticipate ETAs, train position, velocity, and track problems to avoid head-on collisions. This proposed scheme reduces level crossing accidents and fatalities at a lower cost than current market solutions. Index Terms: Deep Learning, Microcontroller, Object Detection, Railway Crossing, Raspberry Pi


Can Ensemble of Classifiers Provide Better Recognition Results in Packaging Activity?

A. H. M. Nazmus Sakib, Promit Basak, Syed Doha Uddin, Shahamat Mustavi Tasin, Md Atiqur Rahman Ahad

Categories: Computer Vision | Machine Learning

2022-11-05

Skeleton-based Motion Capture (MoCap) systems have long been widely used in the game and film industries to mimic complex human actions. MoCap data has also proved effective in human activity recognition tasks. However, recognition remains quite challenging on smaller datasets, and the lack of such data for industrial activities adds to the difficulty. In this work, we propose an ensemble-based machine learning methodology that is targeted to work better on MoCap datasets. The experiments were performed on the MoCap data provided in the Bento Packaging Activity Recognition Challenge 2021. Bento is a Japanese word for a lunch box. After first processing the raw MoCap data, we achieved an accuracy of 98% on 10-fold cross-validation and 82% on leave-one-out cross-validation using the proposed ensemble model.


Analysis and prediction of heart stroke from ejection fraction and serum creatinine using LSTM deep learning approach

Md Ershadul Haque, Salah Uddin, Md Ariful Islam, Amira Khanom, Abdulla Suman, Manoranjan Paul

Categories: Computer Vision | Machine Learning

2022-09-28

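The combination of big data and deep learning is a disruptive technology that, used properly, can greatly affect any objective. With the availability of large healthcare datasets and progress in deep learning techniques, systems can now predict the future trend of a health problem well. From our literature survey, we found that SVMs have been used to predict heart failure without relating objective factors. Drawing on the important historical information in electronic health records (EHR), we built an intelligent predictive model using long short-term memory (LSTM) and predicted the future trend of heart failure from those health records. The fundamental contribution of this work is therefore to predict heart failure using an LSTM based on the patient's electronic medical information. We analyzed a dataset containing the medical records of 299 heart failure patients collected at the Faisalabad Institute of Cardiology and at the Allied Hospital in Faisalabad (Punjab, Pakistan). The patients comprised 105 women and 194 men, aged between 40 and 95 years. The dataset contains 13 features that report clinical, body, and lifestyle information related to heart failure. We found an increasing trend in our analysis, which will help advance knowledge in the field of heart stroke prediction.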


Digital Twin in Safety-Critical Robotics Applications: Opportunities and Challenges

Sabur Baidya, Sumit K. Das, Mohammad Helal Uddin, Chase Kosek, Chris Summers

Categories: Robotics

2022-09-26

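Digital twin technology is envisioned as an integral part of modern industrial evolution. With the rapid advancement of Internet-of-Things (IoT) technology and the increasing trend toward automation, integration between the virtual and physical worlds can now be realized to produce practical digital twins. However, existing definitions of the digital twin are incomplete and sometimes ambiguous. Here, we conduct a historical review and analyze the modern generic view of the digital twin to create a new, extended definition of it. We also review and discuss existing work on digital twins in safety-critical robotics applications. In particular, the use of digital twins in industrial applications necessitates autonomous and remote operation because of environmental challenges. However, uncertainties in the environment may require close monitoring and quick adaptation of the robots, which need to be safe and cost-effective. We present a case study on developing a framework for safety-critical robotic arm applications, report the system performance to show its advantages, and discuss the challenges and scope ahead.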


Edge-assisted Collaborative Digital Twin for Safety-Critical Robotics in Industrial IoT

Sumit K. Das, Mohammad Helal Uddin, Sabur Baidya

Categories: Robotics

2022-09-26

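Digital twin technology plays a pivotal role in modern industrial evolution. In particular, with technological progress in the Internet of Things (IoT) and the increasing trend toward autonomy, multi-sensor-equipped robotics can create practical digital twins, which are particularly useful in industrial applications for operations, maintenance, and safety. Here, we demonstrate a real-world digital twin of a safety-critical robotics application with a Franka-Emika-Panda robotic arm. We develop and showcase an edge-assisted collaborative digital twin for dynamic obstacle avoidance, which can be useful for real-time adaptation of robots operating in the uncertain and dynamic environments of industrial IoT.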
